SpatioTemporal focus for skeleton-based action recognition
Authors
Abstract
Graph convolutional networks (GCNs) are widely adopted in skeleton-based action recognition due to their powerful ability to model data topology. We argue that the performance of recently proposed methods is limited by the following factors. First, the predefined graph structures are shared throughout the network, lacking the flexibility and capacity to model multi-grain semantic information. Second, the relations among global joints are not fully exploited by local graph convolution, which may lose implicit joint relevance. For instance, actions such as running and waving are performed by the co-movement of body parts and joints, e.g., legs and arms; however, these are located far apart in terms of physical connection. Inspired by the attention mechanism, we propose a multi-grain contextual focus module, termed MCF, to capture the action-associated relation information from the body joints and parts. As a result, more explainable representations for different skeleton action sequences can be obtained by MCF. In this study, we follow the common practice of adopting a dense sample strategy for the input skeleton sequences, which brings much redundancy since the number of instances has nothing to do with the actions. To reduce the redundancy, a temporal discrimination focus module, termed TDF, is developed to capture the locally sensitive points of the temporal dynamics. MCF and TDF are integrated into the standard GCN network to form a unified architecture, named STF-Net. It is noted that STF-Net provides the capability to capture robust movement patterns from these skeleton topology structures, based on multi-grain context aggregation and temporal dependency. Extensive experimental results show that our STF-Net achieves state-of-the-art results on three challenging benchmarks: NTU-RGB+D 60, NTU-RGB+D 120, and Kinetics-Skeleton.
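The contrast the abstract draws between local graph convolution and attention over all joints can be illustrated with a small NumPy sketch. This is not the authors' MCF module: the five-joint chain skeleton, the random features, and the function names `local_graph_conv` and `global_joint_attention` are all assumptions made for illustration, and no learned weights are involved. It only shows that a fixed adjacency gives zero direct weight to physically distant joints, while dot-product attention can link joints that move alike.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def local_graph_conv(features, adjacency):
    """One aggregation step over a fixed graph (no learned weights).

    Each joint averages itself and its physically connected neighbours.
    Returns the aggregated features and the normalized mixing matrix.
    """
    a_hat = adjacency + np.eye(adjacency.shape[0])  # add self-loops
    mix = a_hat / a_hat.sum(axis=1, keepdims=True)  # row-normalize
    return mix @ features, mix


def global_joint_attention(features):
    """Scaled dot-product attention across all joints (no learned weights).

    Every joint can attend to every other joint, regardless of the skeleton.
    Returns the attended features and the attention matrix.
    """
    d = features.shape[1]
    scores = features @ features.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ features, weights


# Hypothetical 5-joint chain: hand - elbow - torso - knee - foot.
adjacency = np.array(
    [
        [0, 1, 0, 0, 0],
        [1, 0, 1, 0, 0],
        [0, 1, 0, 1, 0],
        [0, 0, 1, 0, 1],
        [0, 0, 0, 1, 0],
    ],
    dtype=float,
)

rng = np.random.default_rng(0)
features = rng.normal(size=(5, 4))  # 5 joints, 4 channels each
features[4] = features[0]           # assume hand and foot move identically

local_out, local_mix = local_graph_conv(features, adjacency)
global_out, attn = global_joint_attention(features)

# The fixed graph never mixes hand (0) and foot (4) in a single step,
# whereas attention gives the foot the same weight as the hand itself.
print("local weight hand->foot:", local_mix[0, 4])
print("attention weight hand->foot:", attn[0, 4])
```

In a trained model the attention scores would come from learned projections of the joint features; the raw dot product is used here only to keep the sketch short.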
Related articles
Leveraging the Path Signature for Skeleton-based Human Action Recognition
Human action recognition in videos is one of the most challenging tasks in computer vision. One important issue is how to design discriminative features for representing spatial context and temporal dynamics. Here, we introduce a path signature feature to encode information from intra-frame and inter-frame contexts. A key step towards leveraging this feature is to construct the proper trajector...
Spatio-Temporal Graph Convolution for Skeleton Based Action Recognition
Variations of human body skeletons may be considered as dynamic graphs, which are generic data representation for numerous real-world applications. In this paper, we propose a spatio-temporal graph convolution (STGC) approach for assembling the successes of local convolutional filtering and sequence learning ability of autoregressive moving average. To encode dynamic graphs, the constructed mul...
Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition
Dynamics of human body skeletons convey significant information for human action recognition. Conventional approaches for modeling skeletons usually rely on hand-crafted parts or traversal rules, thus resulting in limited expressive power and difficulties of generalization. In this work, we propose a novel model of dynamic skeletons called Spatial-Temporal Graph Convolutional Networks (ST-GCN), ...
Spatiotemporal Residual Networks for Video Action Recognition
Two-stream Convolutional Networks (ConvNets) have shown strong performance for human action recognition in videos. Recently, Residual Networks (ResNets) have arisen as a new technique to train extremely deep architectures. In this paper, we introduce spatiotemporal ResNets as a combination of these two approaches. Our novel architecture generalizes ResNets for the spatiotemporal domain by intro...
Journal
Journal title: Pattern Recognition
Year: 2023
ISSN: 1873-5142, 0031-3203
DOI: https://doi.org/10.1016/j.patcog.2022.109231